Data Science Visualization

Communication Is As Important As Everything Else

Data Visualization Tools in R

  1. ggplot2
  2. R Shiny Applications
  3. Plotly
  4. Leaflet

ggplot2

Components of the Layered Grammar

  1. Data
  2. Aesthetic mapping
  3. Geometric objects
  4. Statistical transformation
  5. Position adjustment
  6. Scale
  7. Coordinate system
  8. Facets

Data, aesthetic mappings, a geometric object, a statistical transformation, and a position adjustment together form a layer. Scales, the coordinate system, and facets apply to the plot as a whole.
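
As a rough illustration of how those five pieces combine, the sketch below builds a single layer explicitly with ggplot2's layer() function rather than a geom_*() shortcut. The data frame toy is made up for this example only.

```r
library(ggplot2)

# Hypothetical toy data, used only for this illustration
toy <- data.frame(x = 1:10, y = (1:10)^2)

# One layer = data + aesthetic mapping + geom + stat + position adjustment
p <- ggplot() +
  layer(
    data     = toy,
    mapping  = aes(x = x, y = y),
    geom     = "point",
    stat     = "identity",
    position = "identity"
  )
p
```

In everyday use, geom_point() is shorthand for roughly this call, with the stat and position adjustment filled in by default.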

Create a blank canvas

library(ggplot2)

ggplot()

A ggplot object can be assigned to a variable

fig <- ggplot()
fig

Data | Aesthetic Mapping | Geometric Objects | Statistical Transformation | Position Adjustment | Scale | Coordinate System | Facets

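set.seed(1)                  # optional: makes the random data reproducible
n <- 100
x <- runif(n, 0, 10)         # n uniform draws on [0, 10]
residuals <- rnorm(n)        # standard normal noise
y <- x + residuals           # linear trend plus noise
df <- data.frame(x=x, y=y)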

Data must be in a data frame
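
When the values start out in another structure, they can be converted first. A minimal sketch, assuming a hypothetical matrix m with made-up column names dose and response:

```r
library(ggplot2)

# Hypothetical measurements stored as a matrix: one row per observation
m <- matrix(c(1, 2, 3, 4,
              2.1, 3.9, 6.2, 7.8),
            ncol = 2, dimnames = list(NULL, c("dose", "response")))

# Convert to a data frame before handing the data to ggplot()
measurements <- as.data.frame(m)
str(measurements)

ggplot(measurements, aes(x = dose, y = response)) +
  geom_point()
```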

ggplot(df)
ggplot(data=df)

The first argument is the data frame

ggplot(df, aes(x=x, y=y))

ggplot(df, aes(x=x, y=y)) +
  geom_point()

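# Two extra columns: a numeric size and a categorical group label
size <- sample(x=c(4, 3, 2, 1), size=nrow(df),
               prob=c(.1, .2, .3, .4), replace=TRUE)
group <- sample(x=c("a", "b", "c", "d"), size=nrow(df),
                prob=c(.1, .2, .3, .4), replace=TRUE)
# Name the column "group" so that aes(color=group) finds it in df
df <- cbind(df, size=size, group=group)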

ggplot(df, aes(x=x, y=y, size=size, color=group)) + geom_point()

ggplot(df, aes(x=x, y=y)) +
  geom_point(size=.25, color="#00BFC4")
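
The two plots above differ in whether an aesthetic is mapped or set. Inside aes() a value is treated as data and passed through a scale; outside aes() it is applied literally. A short sketch of the distinction, using a made-up data frame pts:

```r
library(ggplot2)

pts <- data.frame(x = 1:5, y = c(2, 4, 3, 5, 1))  # made-up data

# Setting: every point is drawn in the literal color "blue"
p_set <- ggplot(pts, aes(x = x, y = y)) +
  geom_point(color = "blue")

# Mapping: "blue" is treated as a one-level categorical variable,
# so the color scale chooses the color and a legend appears
p_map <- ggplot(pts, aes(x = x, y = y)) +
  geom_point(aes(color = "blue"))
```

Printing p_map is a quick way to see the common mistake: the points are not blue, because the string went through the default color scale.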

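# color mapped in ggplot(): inherited by every layer, so one smoother per group
ggplot(df, aes(x=x, y=y, color=group)) +
  geom_smooth() + geom_point()

# color mapped in geom_point() only: a single smoother over all points
ggplot(df, aes(x=x, y=y)) +
  geom_smooth() + geom_point(aes(color=group))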

# Default smoother (loess for fewer than 1,000 observations)
ggplot(df, aes(x=x, y=y)) +
  geom_point() + geom_smooth()

# Linear model fit instead
ggplot(df, aes(x=x, y=y)) +
  geom_point() + geom_smooth(method="lm")

ggplot(df, aes(x=x)) +
  geom_histogram()

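# The default stat for geom_bar() counts the rows in each group
ggplot(df) + geom_bar(aes(x=group))
# after_stat(prop) plots proportions instead; group=1 computes them over all bars
ggplot(df) + geom_bar(aes(x=group, y=after_stat(prop), group=1))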

ggplot(df) + geom_bar(aes(x=group, fill=factor(size)))
ggplot(df) + geom_bar(aes(x=group, fill=factor(size)), position="dodge")

ggplot(df) + geom_bar(aes(x=group, y=after_stat(prop), group=1))

# Same plot with the y-axis labelled as percentages
ggplot(df) +
  geom_bar(aes(x=group, y=after_stat(prop), group=1)) +
  scale_y_continuous(labels=scales::percent)

df$yExp <- exp(df$y)

ggplot(df, aes(x=x, y=yExp)) +
  geom_point()

ggplot(df, aes(x=x, y=yExp)) +
  geom_point() + scale_y_log10()

ggplot(df, aes(x=x, y=y)) + geom_point() +
  coord_cartesian()

ggplot(df, aes(x=x, y=y)) + geom_point() +
  coord_cartesian(xlim = c(4, 6))
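
coord_cartesian() zooms the view without discarding data. A common alternative, setting limits on the scale, instead turns points outside the range into missing values before any statistics are computed. The sketch below contrasts the two on made-up data d (an assumption, built in the same spirit as df); the fitted lines can differ because only the first version uses every point.

```r
library(ggplot2)

set.seed(1)
d <- data.frame(x = runif(100, 0, 10))
d$y <- d$x + rnorm(100)  # made-up data: linear trend plus noise

# Zoom: all 100 points feed the smoother; only the view is cropped
p_zoom <- ggplot(d, aes(x = x, y = y)) +
  geom_point() + geom_smooth(method = "lm") +
  coord_cartesian(xlim = c(4, 6))

# Scale limits: points outside [4, 6] become NA and are removed
# (with a warning) before the smoother is fitted
p_clip <- ggplot(d, aes(x = x, y = y)) +
  geom_point() + geom_smooth(method = "lm") +
  scale_x_continuous(limits = c(4, 6))
```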

library(mapproj)  # required by coord_map(); map_data("world") also needs the maps package

ggplot() +
  geom_polygon(data=map_data("world"), aes(x=long, y=lat, group=group)) +
  coord_map("orthographic", orientation=c(0, 0, 0))

ggplot() +
  geom_polygon(data=map_data("world"), aes(x=long, y=lat, group=group)) +
  coord_map("mercator", xlim=c(-180,180))

ggplot(df, aes(x=x, y=y)) +
  geom_point() +
  facet_grid(~ size)

Labels

ggplot(df, aes(x=x, y=y, color=group)) + geom_point() +
  labs(title="Random Data",
       subtitle="Plotting some randomly generated data",
       x="The x-axis",
       y="The y-axis",
       color="The groups")

R Shiny

  • R package that makes it easy to build interactive web apps using R code.
  • Make interactive plots in R Markdown or on a web page.
  • Build an online dashboard.

Three main components:

  1. User interface (ui): controls the layout and appearance
  2. Server (server): instructions for building the app's outputs from its inputs
  3. shinyApp(): builds the app from the user interface and server

library(shiny)
ui <- fluidPage()
server <- function(input, output) {}
shinyApp(ui = ui, server = server)

Save this code in a file named app.R, or run it from a code chunk in an R Markdown document (with runtime: shiny set in the document header).

Let’s Look At An Example
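
One possible example, offered as an assumption rather than the app shown in the original presentation: a minimal sketch with one slider input and one plot output. The input id n, the output id scatter, and the titles are made up for illustration.

```r
library(shiny)
library(ggplot2)

# User interface: one slider input and one plot output
ui <- fluidPage(
  titlePanel("Random scatter plot"),
  sliderInput("n", "Number of points", min = 10, max = 500, value = 100),
  plotOutput("scatter")
)

# Server: redraws the plot whenever the slider value changes
server <- function(input, output) {
  output$scatter <- renderPlot({
    x <- runif(input$n, 0, 10)
    d <- data.frame(x = x, y = x + rnorm(input$n))
    ggplot(d, aes(x = x, y = y)) + geom_point()
  })
}

# Combine the two; print `app` (or call runApp(app)) to launch it
app <- shinyApp(ui = ui, server = server)
```

Because renderPlot() reads input$n, Shiny reruns that block each time the slider moves, which is what makes the plot interactive.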

What do you do with a Shiny app?

Plotly

  • Another way to make interactive plots
  • Libraries available for several languages, including R and Python

library(plotly)

p <- ggplot(df, aes(x=x, y=y)) + 
  geom_point()  
fig = ggplotly(p)
fig

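# Specify the trace type explicitly to avoid the "No trace type specified" warning
fig <- plot_ly() %>%
  add_trace(data=df, x=~x, y=~y, type="scatter", mode="markers")
fig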

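# Turn the last row into an outlier; hovering over a point shows its row name
df[nrow(df), c("x", "y")] <- list(1, 10)
fig <- plot_ly() %>%
  add_trace(data=df, x=~x, y=~y, type="scatter", mode="markers",
            text=rownames(df))
fig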

Leaflet

library(leaflet)

leaflet() %>%
  addTiles() %>%  
  addMarkers(lng=174.768, lat=-36.852, popup="The birthplace of R")
